UPI: A Primary Index for Uncertain Databases

Authors

  • Hideaki Kimura
  • Samuel Madden
  • Stanley B. Zdonik
Abstract

Uncertain data management has received growing attention from industry and academia. Many efforts have been made to optimize uncertain databases, including the development of special index data structures. However, none of these efforts has explored primary (clustered) indexes for uncertain databases, even though clustering can offer substantial speedups for non-selective analytic queries on large uncertain databases. In this paper, we propose a new index called a UPI (Uncertain Primary Index) that clusters heap files according to uncertain attributes with both discrete and continuous uncertainty distributions. Because uncertain attributes may have several possible values, a UPI on an uncertain attribute duplicates tuple data once for each possible value. To keep the size of the UPI manageable, low-probability tuples are placed in a special Cutoff Index that is consulted only when queries for low-probability values are run. We also propose several other optimizations, including techniques to improve secondary index performance and techniques to reduce maintenance costs and fragmentation by buffering changes to the table and writing updates in sequential batches. Finally, we develop cost models for UPIs that estimate query performance in various settings and help automatically select the tuning parameters of a UPI. We have implemented a prototype UPI and experimented on two real datasets. Our results show that UPIs can significantly (by up to two orders of magnitude) improve the performance of uncertain queries over both clustered and unclustered attributes. We also show that our buffering techniques mitigate table fragmentation and keep the maintenance cost as low as, or even lower than, that of an unclustered heap file.
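The duplication and Cutoff Index ideas described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: the class name `UPISketch`, the single `cutoff` parameter, the dictionary-based storage, and the choice to keep only a tuple id in the cutoff structure are all assumptions made for illustration, and the sketch covers discrete distributions only.

```python
from collections import defaultdict


class UPISketch:
    """Toy model of a clustered index on an uncertain attribute (hypothetical)."""

    def __init__(self, cutoff):
        self.cutoff = cutoff
        # value -> [(prob, tuple_id, payload)]: tuple data duplicated per value.
        self.clustered = defaultdict(list)
        # value -> [(prob, tuple_id)]: low-probability alternatives, id only.
        self.cutoff_index = defaultdict(list)
        # tuple_id -> payload: stand-in for following a pointer to the tuple.
        self.tuples = {}

    def insert(self, tuple_id, payload, alternatives):
        """alternatives is a list of (value, probability) pairs."""
        self.tuples[tuple_id] = payload
        for value, prob in alternatives:
            if prob >= self.cutoff:
                self.clustered[value].append((prob, tuple_id, payload))
            else:
                self.cutoff_index[value].append((prob, tuple_id))

    def query(self, value, threshold):
        """Return (tuple_id, prob, payload) with P(attr == value) >= threshold."""
        results = [
            (tid, prob, payload)
            for prob, tid, payload in self.clustered.get(value, [])
            if prob >= threshold
        ]
        # The cutoff structure is consulted only for low-probability queries.
        if threshold < self.cutoff:
            for prob, tid in self.cutoff_index.get(value, []):
                if prob >= threshold:
                    results.append((tid, prob, self.tuples[tid]))
        return sorted(results, key=lambda r: -r[1])


upi = UPISketch(cutoff=0.2)
upi.insert(1, {"name": "Alice"}, [("MIT", 0.9), ("Brown", 0.1)])
upi.insert(2, {"name": "Bob"}, [("Brown", 0.6), ("MIT", 0.4)])
high = upi.query("Brown", 0.5)   # answered from the clustered entries alone
low = upi.query("Brown", 0.05)   # also consults the cutoff structure
```

In this sketch a query whose threshold is at or above `cutoff` never touches the cutoff structure, which mirrors the abstract's statement that the Cutoff Index is consulted only for low-probability queries. How a real UPI lays out pages, handles continuous distributions, or chooses the cutoff value is not modeled here.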

Journal:
  • PVLDB

Volume 3  Issue 

Pages  -

Publication date 2010